Statistical Disclosure Control: New Directions and Challenges
Abstract
Statistical agencies have traditionally released outputs as microdata and tabular data. Microdata typically come from social surveys, while tabular data contain either frequency counts, such as census tables, or magnitude data, typically from business surveys, e.g. total revenue. For each of these traditional outputs, there has been much research on how to quantify disclosure risk, on optimal statistical disclosure control (SDC) methods, and on how to assess the impact of those methods on data utility. The two main disclosure risks for traditional outputs are identity disclosure, where a statistical unit can be identified from a set of identifying variables, and attribute disclosure, where new information can be learnt about an individual or a group of individuals. A risk that is often overlooked in traditional outputs is inferential disclosure: learning new attributes with high probability. For example, a regression model with very high predictive power may cause inferential disclosure, and this can affect an individual even if that individual is not in the dataset. Another example is disclosure by differencing, which arises when multiple releases are disseminated from one data source: census tables can be differenced or otherwise manipulated to reveal individual units. For traditional hard-copy census tables, disclosure by differencing is controlled by fixing the set of variables and categories so that non-nested groups of individuals cannot be differenced. In Section 2 of this paper we discuss inferential disclosure further and define differential privacy, as developed in the computer science literature for protecting outputs in online query systems. Inferential disclosure is now a key risk that statistical agencies need to consider when developing new online and remote strategies for disseminating statistical outputs.
Section 3 describes recent advances in data dissemination with some examples. Section 4 concludes with a discussion.
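Disclosure by differencing, as described in the abstract, can be illustrated with a minimal sketch. The records, age bands, and function names below are invented for illustration and do not come from the paper; the sketch only assumes that two frequency tables are released from the same data source with slightly different age bands.

```python
from collections import Counter

# Hypothetical microdata: (age, occupation). All values are invented.
records = [
    (17, "student"),
    (19, "student"),
    (21, "retail"),
    (23, "student"),
    (24, "nurse"),
    (30, "teacher"),
    (41, "teacher"),
]


def occupation_table(rows, min_age, max_age):
    """Frequency table of occupation for people aged min_age..max_age inclusive."""
    return Counter(occ for age, occ in rows if min_age <= age <= max_age)


# Release 1: a table using the age band 16-24.
release_1 = occupation_table(records, 16, 24)

# Release 2: a second table from the same source using the age band 16-23.
release_2 = occupation_table(records, 16, 23)

# Subtracting the two tables isolates the people aged exactly 24.
# Counter subtraction keeps only the cells with a positive difference.
implied = release_1 - release_2

print(dict(implied))  # {'nurse': 1}
```

Neither table on its own contains a cell that singles out one person, yet their difference shows that the only person aged 24 is a nurse. Fixing the variables and categories across releases, as the abstract notes for hard-copy census tables, removes the second banding and with it this particular subtraction.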
Similar resources
Tabular Statistical Disclosure Control: Optimization Techniques in Suppression and Controlled Tabular Adjustment
The problem of disseminating tabular data such that the amount of information provided satisfies the public need while protecting individually identifiable data is a problem in all governmental statistical agencies. The problem falls into the category of Statistical Disclosure Control and provides many difficult policy and technical challenges for these agencies. In order to achieve the double ...
Security and privacy for database systems
Database security is a discipline that seeks methods to protect data stored at DBMSs from intrusions, improper modifications, theft, and unauthorized disclosure of private information. This is realized through a set of security services, which meet the security requirements of both the system and the data sources. A number of different techniques and approaches has been developed to assure data...
Data-swapping -- a Technique for Disclosure Control
In recent years there has been increasing concern about the confidentiality of computerized databases. This has led to a fast growing interest in the development of techniques for controlling the disclosure of information from such databases about individuals, both natural and legal persons. Much of this interest has been focused on the release of statistical tabulations and microdata files. ...
Statistical Disclosure Control for Data Privacy Preservation
With the phenomenal change in a way data are collected, stored and disseminated among various data analyst there is an urgent need of protecting the privacy of data. As when individual data get disseminated among various users, there is a high risk of revelation of sensitive data related to any individual, which may violate various legal and ethical issues. Statistical Disclosure Control (SDC) ...